# CS:APP Chapter 4 Computer Architecture Sequential Implementation









# **Building Blocks**

- Combinational Logic
  - Compute Boolean functions of inputs
  - Continuously respond to input changes
  - Operate on data and implement control
- Storage Elements
  - Store bits
  - Addressable memories
  - Non-addressable registers
  - Loaded only as clock rises





# Hardware Control Language

- Very simple hardware description language
- Can only express limited aspects of hardware operation
  - Parts we want to explore and modify

#### Data Types

- bool: Boolean
  - a, b, c, ...
- int: words
  - A, B, C, ...
  - Does not specify word size---bytes, 32-bit words, ...

#### Statements

- bool a = bool-expr;
- int A = int-expr;

## **HCL** Operations

Classify by type of value returned

#### Boolean Expressions

- Logic Operations
  - a && b, a || b,!a
- Word Comparisons
  - $\bullet$  A == B, A != B, A < B, A <= B, A >= B, A > B
- Set Membership
  - A in { B, C, D }
    - Same as  $A == B \mid \mid A == C \mid \mid A == D$

# Word Expressions

- Case expressions
  - [ a : A; b : B; c : C ]
  - Evaluate test expressions a, b, c, ... in sequence
  - Return word expression A, B, C, ... for first successful test

#### SEQ Hardware Structure

- State
  - Program counter register (PC)
  - Condition code register (CC)
  - Register File
  - Memories
    - Access same memory space
    - Data: for reading/writing program data
    - Instruction: for reading instructions
- Instruction Flow
  - Read instruction at address specified by PC
  - Process through stages
  - Update program counter



# **SEQ Stages**

- Fetch
  - Read instruction from instruction memory
- Decode
  - Read program registers
- Execute
  - Compute value or address
- Memory
  - Read or write data
- Write Back
  - Write program registers
- PC
  - Update program counter



# Instruction Decoding



#### • Instruction Format

- Instruction byte
- Optional register byte
- Optional constant word

icode:ifun

rA:rB

valC

# Executing Arith./Logical Operation



- Fetch
  - Read 2 bytes
- Decode
  - Read operand registers
- Execute
  - Perform operation
  - Set condition codes

- Memory
  - Do nothing
- Write back
  - Update register
- PC Update
  - Increment PC by 2

#### Stage Computation: Arith/Log. Ops

|           | OPq rA, rB          |  |
|-----------|---------------------|--|
|           | icode:ifun ← M₁[PC] |  |
| Fetch     | rA:rB ← M₁[PC+1]    |  |
|           | valP ← PC+2         |  |
| Decode    | valA ← R[rA]        |  |
| Decode    | valB ← R[rB]        |  |
| Execute   | valE ← valB OP valA |  |
| LACCULC   | Set CC              |  |
| Memory    |                     |  |
| Write     | R[rB] ← valE        |  |
| back      |                     |  |
| PC update | PC ← valP           |  |

Read instruction byte Read register byte

Compute next PC
Read operand A
Read operand B
Perform ALU operation
Set condition code register

Write back result

**Update PC** 

- Formulate instruction execution as sequence of simple steps
- Use same general form for all instructions

#### Executing rmmovq

rmmovq rA, D(rB) 4 0 rA rB D

- Fetch
  - Read 10 bytes
- Decode
  - Read operand registers
- Execute
  - Compute effective address

- Memory
  - Write to memory
- Write back
  - Do nothing
- PC Update
  - Increment PC by 10

## Stage Computation: rmmovq

|           | rmmovq rA, D(rB)             |  |
|-----------|------------------------------|--|
|           | icode:ifun ← M₁[PC]          |  |
| Fetch     | rA:rB ← M₁[PC+1]             |  |
| retcii    | valC ← M <sub>8</sub> [PC+2] |  |
|           | valP ← PC+10                 |  |
| Dagada    | valA ← R[rA]                 |  |
| Decode    | valB ← R[rB]                 |  |
| Execute   | valE ← valB + valC           |  |
| Memory    | M <sub>8</sub> [valE] ← valA |  |
| Write     |                              |  |
| back      |                              |  |
| PC update | PC ← valP                    |  |

Read instruction byte
Read register byte
Read displacement D
Compute next PC
Read operand A
Read operand B
Compute effective address

Write value to memory

**Update PC** 

Use ALU for address computation

#### Executing popq



- Fetch
  - Read 2 bytes
- Decode
  - Read stack pointer
- Execute
  - Increment stack pointer by 8

- Memory
  - Read from old stack pointer
- Write back
  - Update stack pointer
  - Write result to register
- PC Update
  - Increment PC by 2

#### Stage Computation: popq

|           | popq rA                      |  |
|-----------|------------------------------|--|
|           | icode:ifun ← M₁[PC]          |  |
| Fetch     | rA:rB ← M₁[PC+1]             |  |
|           | valP ← PC+2                  |  |
| Decode    | valA ← R[%rsp]               |  |
| Decode    | valB ← R[%rsp]               |  |
| Execute   | valE ← valB + 8              |  |
| Memory    | valM ← M <sub>8</sub> [valA] |  |
| Write     | R[%rsp] ← valE               |  |
| back      | R[rA] ← valM                 |  |
| PC update | PC ← valP                    |  |

Read instruction byte Read register byte

Compute next PC
Read stack pointer
Read stack pointer
Increment stack pointer

Read from stack
Update stack pointer
Write back result
Update PC

- Use ALU to increment stack pointer
- Must update two registers
  - Popped value
  - New stack pointer

# **Executing Conditional Moves**

cmovXX rA, rB 2 fn rA rB

- Fetch
  - Read 2 bytes
- Decode
  - Read operand registers
- Execute
  - If !cnd, then set destination register to 0xF

- Memory
  - Do nothing
- Write back
  - Update register (or not)
- PC Update
  - Increment PC by 2

#### Stage Computation: Cond. Move

|           | cmovXX rA, rB               |  |
|-----------|-----------------------------|--|
| Fetch     | icode:ifun ← M₁[PC]         |  |
|           | rA:rB ← M₁[PC+1]            |  |
|           | valP ← PC+2                 |  |
| Decode    | valA ← R[rA]                |  |
| Decode    | valB ← 0                    |  |
| Execute   | valE ← valB + valA          |  |
| LACCUIC   | If ! Cond(CC,ifun) rB ← 0xF |  |
| Memory    |                             |  |
| Write     | R[rB] ← valE                |  |
| back      |                             |  |
| PC update | PC ← valP                   |  |

Read instruction byte Read register byte

Compute next PC Read operand A

Pass valA through ALU (Disable register update)

Write back result

**Update PC** 

- Read register rA and pass through ALU
- Cancel move by setting destination register to 0xF
  - If condition codes & move condition indicate no move

#### **Executing Jumps**



- Fetch
  - Read 9 bytes
  - Increment PC by 9
- Decode
  - Do nothing
- Execute
  - Determine whether to take branch based on jump condition and condition codes

- Memory
  - Do nothing
- Write back
  - Do nothing
- PC Update
  - Set PC to Dest if branch taken or to incremented PC if not branch

#### Stage Computation: Jumps

|           | jXX Dest                                 |                                               |
|-----------|------------------------------------------|-----------------------------------------------|
|           | icode:ifun ← M₁[PC]                      | Read instruction byte                         |
| Fetch     | valC ← M <sub>8</sub> [PC+1] valP ← PC+9 | Read destination address Fall through address |
| Decode    |                                          |                                               |
| Execute   | Cnd ← Cond(CC,ifun)                      | Take branch?                                  |
| Memory    |                                          |                                               |
| Write     |                                          |                                               |
| back      |                                          |                                               |
| PC update | PC ← Cnd ? valC : valP                   | Update PC                                     |

- Compute both addresses
- Choose based on setting of condition codes and branch condition

#### Executing call



- Fetch
  - Read 9 bytes
  - Increment PC by 9
- Decode
  - Read stack pointer
- Execute
  - Decrement stack pointer by 8

- Memory
  - Write incremented PC to new value of stack pointer
- Write back
  - Update stack pointer
- PC Update
  - Set PC to Dest

#### Stage Computation: call

|           | call Dest                                                                                                        |  |
|-----------|------------------------------------------------------------------------------------------------------------------|--|
| Fetch     | icode:ifun $\leftarrow$ M <sub>1</sub> [PC]<br>valC $\leftarrow$ M <sub>8</sub> [PC+1]<br>valP $\leftarrow$ PC+9 |  |
| Decode    | valB ← R[%rsp]                                                                                                   |  |
| Execute   | valE ← valB + -8                                                                                                 |  |
| Memory    | M <sub>8</sub> [valE] ← valP                                                                                     |  |
| Write     | R[%rsp] ← valE                                                                                                   |  |
| back      |                                                                                                                  |  |
| PC update | PC ← valC                                                                                                        |  |

Read instruction byte

Read destination address
Compute return point

Read stack pointer

Decrement stack pointer

Write return value on stack Update stack pointer

**Set PC to destination** 

- Use ALU to decrement stack pointer
- Store incremented PC

#### Executing ret



- Fetch
  - Read 1 byte
- Decode
  - Read stack pointer
- Execute
  - Increment stack pointer by 8

- Memory
  - Read return address from old stack pointer
- Write back
  - Update stack pointer
- PC Update
  - Set PC to return address

#### Stage Computation: ret

|           | ret                           |  |
|-----------|-------------------------------|--|
| Fetch     | icode:ifun ← M₁[PC]           |  |
| Decode    | valA ← R[%rsp] valB ← R[%rsp] |  |
| Execute   | valE ← valB + 8               |  |
| Memory    | valM ← M <sub>8</sub> [valA]  |  |
| Write     | R[%rsp] ← valE                |  |
| back      |                               |  |
| PC update | PC ← valM                     |  |

**Read instruction byte** 

Read operand stack pointer Read operand stack pointer Increment stack pointer

Read return address Update stack pointer

**Set PC to return address** 

- Use ALU to increment stack pointer
- Read return address from memory

#### Computation Steps

|           |            | OPq rA, rB          |
|-----------|------------|---------------------|
|           | icode,ifun | icode:ifun ← M₁[PC] |
| Fetch     | rA,rB      | rA:rB ← M₁[PC+1]    |
| rettii    | valC       |                     |
|           | valP       | valP ← PC+2         |
| Decode    | valA, srcA | valA ← R[rA]        |
| Decode    | valB, srcB | valB ← R[rB]        |
| Execute   | valE       | valE ← valB OP valA |
| Execute   | Cond code  | Set CC              |
| Memory    | valM       |                     |
| Write     | dstE       | R[rB] ← valE        |
| back      | dstM       |                     |
| PC update | PC         | PC ← valP           |

**Read instruction byte** Read register byte [Read constant word] **Compute next PC Read operand A** Read operand B **Perform ALU operation** Set/use cond. code reg [Memory read/write] Write back ALU result [Write back memory result] **Update PC** 

- All instructions follow same general pattern
- Differ in what gets computed on each step

#### Computation Steps

|           |            | call Dest                    |
|-----------|------------|------------------------------|
|           | icode,ifun | icode:ifun ← M₁[PC]          |
| Fetch     | rA,rB      |                              |
| reich     | valC       | valC ← M <sub>8</sub> [PC+1] |
|           | valP       | valP ← PC+9                  |
| Decode    | valA, srcA |                              |
| Decode    | valB, srcB | valB ← R[%rsp]               |
| Execute   | valE       | valE ← valB + -8             |
| Execute   | Cond code  |                              |
| Memory    | valM       | M <sub>8</sub> [valE] ← valP |
| Write     | dstE       | R[%rsp] ← valE               |
| back      | dstM       |                              |
| PC update | PC         | PC ← valC                    |

**Read instruction byte** [Read register byte] Read constant word **Compute next PC** [Read operand A] Read operand B **Perform ALU operation** [Set /use cond. code reg] Memory read/write Write back ALU result [Write back memory result] **Update PC** 

- All instructions follow same general pattern
- Differ in what gets computed on each step

#### Computed Values

#### Fetch

icode Instruction code

ifun Instruction function

rA Instr. Register A

rB Instr. Register B

valC Instruction constant

valP Incremented PC

#### Decode

srcA Register ID A

srcB Register ID B

dstE Destination Register E

dstM Destination Register M

valA Register value A

valB Register value B

#### Execute

valE ALU result

Cnd Branch/move flag

#### Memory

• valM Value from memory

#### **SEQ Hardware**

- Key
  - Blue boxes: predesigned hardware blocks

PC

Memory

Execute

Decode

Fetch

- E.g., memories, ALU
- Gray boxes: control logic
  - Describe in HCL
- White ovals: labels for signals
- Thick lines: 64-bit word values
- Thin lines: 4-8 bit values
- Dotted lines: 1-bit values

data out Data memory valB (dstE (dstM) A B Register Write back rВ icode ( ifun Imem error PC Instruction memory increment PC

# Fetch Logic



- Predefined Blocks
  - PC: Register containing PC
  - Instruction memory: Read 10 bytes (PC to PC+9)
    - Signal invalid address
  - Split: Divide instruction byte into icode and ifun
  - Align: Get fields for rA, rB, and valC

# Fetch Logic



- Control Logic
  - Instr. Valid: Is this instruction valid?
  - icode, ifun: Generate no-op if invalid address
  - Need regids: Does this instruction have a register byte?
  - Need valC: Does this instruction have a constant word?

# Fetch Control Logic in HCL



# Fetch Control Logic in

HCL halt nop cmovXX rA, rB fn rA rB irmovq V, rB rΒ 8 V rA rB rmmovq rA, D(rB) D mrmovq D(rB), rA rA rB D fn rA rB OPq rA, rB jxx Dest fn Dest call Dest Dest ret popq rA popq rA bool need regids = icode in { IRRMOVQ, IOPQ, IPUSHQ, IPOPQ, IIRMOVQ, IRMMOVQ, IMRMOVQ }; bool instr valid = icode in { INOP, IHALT, IRRMOVQ, IIRMOVQ, IRMMOVQ, IMRMOVQ, IOPQ, IJXX, ICALL, IRET, IPUSHQ, IPOPQ };

#### Decode Logic

#### Register File

- Read ports A, B
- Write ports E, M
- Addresses are register IDs or 15 (0xF) (no access)

#### **Control Logic**

- srcA, srcB: read port addresses
- dstE, dstM: write port addresses

#### Signals

- Cnd: Indicate whether or not to perform conditional move
  - Computed in Execute stage



#### A Source

];

```
OPq rA, rB
                    Decode
                                 valA \leftarrow R[rA]
                                                               Read operand A
                                 cmovXX rA, rB
                    Decode
                                 valA \leftarrow R[rA]
                                                               Read operand A
                                 rmmovq rA, D(rB)
                    Decode
                                 valA \leftarrow R[rA]
                                                               Read operand A
                                 popq rA
                    Decode
                                 valA \leftarrow R[\$rsp]
                                                               Read stack pointer
                                 jXX Dest
                    Decode
                                                               No operand
                                 call Dest
                    Decode
                                                               No operand
                                 ret
                    Decode
                                 valA \leftarrow R[\$rsp]
                                                               Read stack pointer
int srcA = [
         icode in { IRRMOVQ, IRMMOVQ, IOPQ, IPUSHQ } : rA;
         icode in { IPOPQ, IRET } : RRSP;
         1 : RNONE; # Don't need register
```

#### E Destination

int dstE = [

];

```
OPq rA, rB
          Write-back
                      R[rB] \leftarrow valE
                                                    Write back result
                      cmovXX rA, rB
                                                    Conditionally write
          Write-back
                                                    back result
                      R[rB] \leftarrow valE
                      rmmovq rA, D(rB)
          Write-back
                                                    None
                      popq rA
          Write-back
                      R[\$rsp] \leftarrow valE
                                                    Update stack pointer
                      iXX Dest
          Write-back
                                                    None
                      call Dest
          Write-back
                      R[\$rsp] \leftarrow valE
                                                    Update stack pointer
                      ret
          Write-back R[%rsp] ← valE
                                                    Update stack pointer
icode in { IRRMOVQ } && Cnd : rB;
icode in { IIRMOVQ, IOPQ} : rB;
icode in { IPUSHQ, IPOPQ, ICALL, IRET } : RRSP;
1 : RNONE; # Don't write any register
```

## **Execute Logic**

- Units
  - ALU
    - Implements 4 required functions
    - Generates condition code values
  - CC
    - Register with 3 condition code bits
  - cond
    - Computes conditional jump/move flag
- Control Logic
  - Set CC: Should condition code register be loaded?
  - ALU A: Input A to ALU
  - ALU B: Input B to ALU
  - ALU fun: What function should ALU compute?



# ALU A Input

int aluA = [

];

```
OPq rA, rB
          Execute
                     valE ← valB OP valA
                                                 Perform ALU operation
                     cmovXX rA, rB
          Execute
                     valE \leftarrow 0 + valA
                                                 Pass valA through ALU
                     rmmovq rA, D(rB)
          Execute
                     valE ← valB + valC
                                                 Compute effective address
                     popq rA
          Execute
                     valE \leftarrow valB + 8
                                                 Increment stack pointer
                     iXX Dest
          Execute
                                                 No operation
                     call Dest
          Execute
                     valE \leftarrow valB + -8
                                                 Decrement stack pointer
                     ret
          Execute
                     valE ← valB + 8
                                                 Increment stack pointer
icode in { IRRMOVQ, IOPQ } : valA;
icode in { IIRMOVQ, IRMMOVQ, IMRMOVQ } : valC;
icode in { ICALL, IPUSHQ } : -8;
icode in { IRET, IPOPQ } : 8;
# Other instructions don't need ALU
```

# ALU Operation

|                       | OPI rA, rB          |                           |
|-----------------------|---------------------|---------------------------|
| Execute               | valE ← valB OP valA | Perform ALU operation     |
|                       | cmovXX rA, rB       |                           |
| Execute               | valE ← 0 + valA     | Pass valA through ALU     |
|                       | rmmovl rA, D(rB)    |                           |
| Execute               | valE ← valB + valC  | Compute effective address |
|                       | popq rA             |                           |
| Execute               | valE ← valB + 8     | Increment stack pointer   |
|                       | jXX Dest            |                           |
| Execute               |                     | No operation              |
|                       | call Dest           |                           |
| Execute               | valE ← valB + -8    | Decrement stack pointer   |
|                       | ret                 |                           |
| Execute               | valE ← valB + 8     | Increment stack pointer   |
| int alufun            | = [                 |                           |
| icode == IOPQ : ifun; |                     |                           |
| 1:                    |                     |                           |
| 1;                    | •                   |                           |

### Memory Logic

## Memory

Reads or writes memory word

## Control Logic

- stat: What is instruction status?
- Mem. read: should word be read?
- Mem. write: should word be written?
- Mem. addr.: Select address
- Mem. data.: Select data



#### Instruction Status

# Control Logic

stat: What is instruction status?



```
## Determine instruction status
int Stat = [
         imem_error || dmem_error : SADR;
         !instr_valid: SINS;
         icode == IHALT : SHLT;
         1 : SAOK;
];
```

### Memory Address

];

```
OPq rA, rB
         Memory
                                                    No operation
                      rmmovq rA, D(rB)
         Memory
                      M_8[valE] \leftarrow valA
                                                    Write value to memory
                      popq rA
         Memory
                      valM \leftarrow M_8[valA]
                                                    Read from stack
                      jXX Dest
         Memory
                                                    No operation
                      call Dest
         Memory
                      M_8[valE] \leftarrow valP
                                                    Write return value on stack
                      ret
         Memory
                      valM \leftarrow M_8[valA]
                                                    Read return address
int mem addr = [
         icode in { IRMMOVQ, IPUSHQ, ICALL, IMRMOVQ } : valE;
         icode in { IPOPQ, IRET } : valA;
         # Other instructions don't need address
```

### Memory Read

```
OPq rA, rB
Memory
                                              No operation
             rmmovq rA, D(rB)
Memory
              M_8[valE] \leftarrow valA
                                              Write value to memory
             popq rA
Memory
             valM \leftarrow M_8[valA]
                                              Read from stack
             iXX Dest
Memory
                                              No operation
             call Dest
             M_8[valE] \leftarrow valP
Memory
                                              Write return value on stack
             ret
Memory
             valM \leftarrow M_8[valA]
                                              Read return address
```

bool mem\_read = icode in { IMRMOVQ, IPOPQ, IRET };

# PC Update Logic

- New PC
  - Select next value of PC



# PC Update

|                                                 | OPq rA, rB             |                          |  |
|-------------------------------------------------|------------------------|--------------------------|--|
| PC update                                       | PC ← valP              | Update PC                |  |
|                                                 | A D(D)                 | !<br>                    |  |
|                                                 | rmmovq rA, D(rB)       |                          |  |
| PC update                                       | PC ← valP              | Update PC                |  |
|                                                 | popq rA                |                          |  |
| PC update                                       | PC ← valP              | Update PC                |  |
|                                                 | jXX Dest               |                          |  |
| PC undata                                       | PC ← Cnd ? valC : valP | Undata BC                |  |
| PC update                                       | PC 	Clid ? ValC . ValP | Update PC                |  |
|                                                 | call Dest              |                          |  |
| PC update                                       | PC ← valC              | Set PC to destination    |  |
|                                                 | ret                    |                          |  |
| PC update                                       | PC ← valM              | Set PC to return address |  |
|                                                 | _                      |                          |  |
| <pre>int new_pc = [</pre>                       |                        |                          |  |
| icode == ICALL : valC;                          |                        |                          |  |
| <pre>icode == IJXX &amp;&amp; Cnd : valC;</pre> |                        |                          |  |
| <pre>icode == IRET : valM;</pre>                |                        |                          |  |
| 1                                               | : valP;                |                          |  |
| ];                                              |                        |                          |  |



#### State

- PC register
- Cond. Code register
- Data memory
- Register file

All updated as clock rises

- Combinational Logic
  - ALU
  - Control logic
  - Memory reads
    - Instruction memory
    - Register file
    - Data memory





- state set according to second irmovq instruction
- combinational logic starting to react to state changes





- state set according to second irmovq instruction
- •combinational logic generates results for addq instruction





- state setaccording toaddq instruction
- combinational logic starting to react to state changes





- state set according to addq instruction
- combinational logic generates results for je instruction

## **SEQ Summary**

# Implementation

- Express every instruction as series of simple steps
- Follow same general flow for each instruction type
- Assemble registers, memories, predesigned combinational blocks
- Connect with control logic

### Limitations

- Too slow to be practical
- In one cycle, must propagate through instruction memory, register file, ALU, and data memory
- Would need to run clock very slowly
- Hardware units only active for fraction of clock cycle